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Abstract 

To recover a sparse signal from an underdetermined system, we often solve a constrained ^i-norm minimization 
problem. In many cases, the signal sparsity and the recovery performance can be further improved by replacing 
the l\ norm with a "weighted" l\ norm. Without any prior information about nonzero elements of the signal, the 
procedure for selecting weights is iterative in nature. Common approaches update the weights at every iteration 
using the solution of a weighted l\ problem from the previous iteration. 
^ In this paper, we present two homotopy-based algorithms that efficiently solve reweighted l\ problems. First, 

we present an algorithm that quickly updates the solution of a weighted l\ problem as the weights change. Since the 
solution changes only slightly with small changes in the weights, we develop a homotopy algorithm that replaces 
the old weights with the new ones in a small number of computationally inexpensive steps. Second, we propose an 
algorithm that solves a weighted l\ problem by adaptively selecting the weights while estimating the signal. This 
algorithm integrates the reweighting into every step along the homotopy path by changing the weights according to 
the changes in the solution and its support, allowing us to achieve a high quality signal reconstruction by solving a 
single homotopy problem. We compare the performance of both algorithms, in terms of reconstruction accuracy and 
r \ computational complexity, against state-of-the-art solvers and show that our methods have smaller computational cost. 

In addition, we will show that the adaptive selection of the weights inside the homotopy often yields reconstructions 
"£3 of higher quality. 

I. Introduction 

J> We consider the fundamental problem of recovering a signal from (possibly incomplete) linear, noisy measure- 
1—1 ments. We observe y G M M as 

[g y = Ax + e, (1) 

^ where x G M. N is an unknown signal of interest that is measured through anMxJV matrix A, and e € M. M is noise. 
^ Recent work in compressive sensing and sparse approximation has shown that if x is sparse and A obeys certain 
"incoherence" conditions, then a stable recovery is possible [1-4]. For instance, we can estimate x by solving the 
t-h following convex optimization program: 

minimize r x 1 -\ — Ax-y 2 , (2) 

X x 2 

^ where the l\ term promotes sparsity in the solution, the £2 term keeps the solution close to the measurements, 
and r > is a user-selected regularization parameter. The program (2), commonly known as the LASSO or Basis 
Pursuit Denoising [5,6], yields good numerical results in a variety of problems and comes with strong theoretical 
guarantees [6-9]. 

Replacing the i\ norm in (2) with a "weighted" l\ norm can often enhance the sparsity of the solution and improve 
the signal recovery performance [10-15]. The weighted ^i-norm minimization form of (2) can be described as 

N 



^Wj|xj| + -||Ax-y|||, (3) 



minimize , 
x * — ' 2 

i=l 

where Wj > denotes the weight at index i. We can adjust the w; in (3) to selectively penalize different coefficients 
in the solution. To promote the same sparsity structure in the solution that is present in the original signal, we can 
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select the w« such that they have small values on the nonzero locations of the signal and significantly larger values 
elsewhere. Since information about the locations and amplitudes of the nonzero coefficients of the original signal 
is not available a priori, the critical task of selecting the weights is performed iteratively. Common approaches for 
such "iterative reweighting" re-compute weights at every iteration using the solution of (3) at the previous iteration. 
Suppose x denotes the solution of (3) for a given set of weights. For the next iteration, we compute the w$ as 



for i = 1,. . ., N, using an appropriate choice of positive values for parameters r and e. We use these updated 
weights in (3) to compute an updated signal estimate, which we then use in (4) to re-compute the weights for the 
next iteration. The major computational cost of every iteration in such a reweighting scheme arises from solving 
(3), for which a number of solvers are available [16-21]. 

In this paper, we present two homotopy-based algorithms for efficiently solving reweighted £i-norm minimization 
problems. In a typical homotopy algorithm for an t\ problem, as the homotopy parameter changes, the solution 
moves along a piecewise-linear path, and each segment on this homotopy path is traced with a computationally 
inexpensive homotopy step. The major computational cost for every homotopy step involves one full matrix-vector 
multiplication and one rank-one update of the inverse of a small matrix. A well-known example is the standard 
LASSO homotopy in which we trace a solution path for (2) by reducing the single parameter r while updating the 
support of the solution by one element at every step [22-24]. By comparison, (3) has N parameters in the form of 
Wj, and both the homotopy algorithms we present in this paper change the w$ in such a way that their respective 
solutions follow piecewise-linear paths in sequences of inexpensive homotopy steps. 

First, we present an algorithm that quickly updates the solution of (3) as the weights change in the iterative 
reweighting framework. Suppose we have the solution of (3) for a given set of weights Wj and we wish to update 
the weights to Wj. We develop a homotopy program that updates the solution of (3) by replacing the old weights 
(Wj) with the new ones (w«). Since the solution of (3) changes only slightly with small changes in the weights, the 
homotopy procedure utilizes information about an existing solution to update the solution in only a small number 
of inexpensive homotopy steps. 

Second, we propose a new homotopy algorithm that performs an internal "adaptive reweighting" after every 
homotopy step. Our algorithm yields a solution for a weighted t\ problem of the form (3) for which the final 
values of Wj are not assigned a priori, but instead are adaptively selected inside the algorithm. In our proposed 
homotopy algorithm, we follow a solution path for (3) by adaptively reducing each Wj, while updating the support 
of the solution by one element at every step. After every homotopy step, we adjust the weights according to the 
changes in the support of the solution so that the Wj on the support shrink at a faster rate, toward smaller values 
(e.g., of the form in (4)), while the Wj elsewhere shrink at a slower rate, toward a predefined threshold (r > 0). 
This allows us to recover a high-quality signal by solving a single homotopy problem, instead of solving (3) 
multiple times via iterative reweighting (i.e., updating Wj after solving (3)). We have also observed that such an 
adaptive reweighting tends to provide better quality of reconstruction compared to the standard method of iterative 
reweighting. In addition to assigning smaller weights to the active indices, this adaptive reweighting serves another 
purpose: it encourages active elements to remain nonzero, which in turn reduces the total number of homotopy 
steps required for solving the entire problem. 

Our proposed adaptive reweighting method bears some resemblance to a variable selection method recently 
presented in [25], which adjusts the level of shrinkage at each step (that is equivalent to reducing the Wj toward 
zero) so as to optimize the selection of the next variable. However, the procedure we adopt for the selection of Wj 
in this paper is more flexible, and it offers an explicit control over the values of Wj, which we exploit to embed a 
reweighted £i-norm regularization inside the homotopy. 

The paper is organized as follows. In Section II, we briefly discuss the homotopy algorithm for (2), on which we 
then build our discussion of the two homotopy algorithms for solving reweighted l\ problem. In Section III, we 
present numerical experiments that compare performance of our proposed algorithms, in terms of reconstruction 
accuracy and computational complexity, against three state-of-the-art solvers for which we use old solutions as 
warm start during iterative reweighting. 
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II. Algorithms 

A. LASSO homotopy 

The well-known LASSO homotopy algorithm solves (2) for one desired value of r by tracing the entire solution 
path for a range of decreasing values of r (i.e., any point on the so-called homotopy path is a solution of (2) for a 
certain value of r) [22,23]. Starting with a large value of r, LASSO homotopy shrinks r toward its final value in 
a sequence of computationally inexpensive steps. The fundamental insight is that as r changes, the solution of (2) 
follows a piecewise-linear path in which the length and the direction of each segment is completely determined by 
the support and the sign sequence of the solution on that segment. This fact can be derived by analyzing the KKT 
optimality conditions for (2), as given below in (5) [26,27]. The support of the solution changes only at certain 
critical values of r, when either a new nonzero element enters the support or an existing nonzero element shrinks 
to zero. These critical values of r are easy to calculate at any point along the homotopy path. For every homotopy 
step, we jump from one critical value of r to the next while updating the support of the solution, until r has been 
lowered to its desired value. 

In every homotopy step, the update direction and the step-size for moving to a smaller critical value of r can be 
easily calculated using certain optimality conditions, which can be derived using the subgradient of the objective 
in (2) [26,28]. At any given value of r, the solution x* for (2) must satisfy the following optimality conditions: 

Ap (Ax* — y) = — rz (5a) 
||Af e (Ax*-y)|| 0O <r, (5b) 

where V denotes the support of x* , z denotes the sign sequence of x* on T, and Ar denotes a matrix with columns 
of A at indices in the set V. The optimality conditions in (5) can be viewed as N constraints that the solution x* 
needs to satisfy (with equality on the support T and strict inequality elsewhere). As we reduce r to r — 5, for a 
small value of S, the solution moves in a direction <9x, which to maintain optimality must obey 

Af (Ax* - y) + 5Ap Adx = — (r — 5)z (6a) 
|| A T (Ax* -y)+(5A T A5x|| 00 < (t-S). (6b) 

— i — 

The update direction that keeps the solution optimal as we change S can be written as 

fc f(A f A r )-z o„r (7) 
10 otherwise. 

We can move in direction <9x until one of the constraints in (6b) is violated, indicating we must add an element to 
the support T, or one of the nonzero elements in x* shrinks to zero, indicating we must remove an element from T. 
The smallest step-size that causes one of these changes in the support can be easily computed as 5* = min(<5 + , S~), 
where 

x+ ■ f r - Pi -t - pA 

<V = mm -, — (8a) 

ier<= Vl + di -l + dij + 

and <T = min f ^ ) , (8b) 

ier \dxi ) + 

and min(-) + means that the minimum is taken over only positive arguments. 5 + is the smallest step-size that causes 
an inactive constraint to become active at index 7 + , indicating that 7 + should enter the support, and 6~ is the 
smallest step-size that shrinks an existing element at index 7" to zero. The new critical value of r becomes r — 5* 
and the new signal estimate x* becomes x* + <5*<9x, and its support and sign sequence are updated accordingly. 
We can now recompute the update direction and the step-size that define the next step in the homotopy path and 
the consequent one-element change in the support. We repeat this procedure until r has been lowered to its desired 
value. 

The main computational cost of every homotopy step comes from computing <9x by solving an S x S system of 
equations in (7) (where S denotes the size of the support V) and from computing the vector d in (5b) that is used 
to compute the step-size 5 in (8). Since we know the values of d on V by construction and 9x is nonzero only 
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Algorithm 1 Iterative reweighting via homotopy 

Require: A, y, x, w, and w 
Ensure: x* 



l: Initialize: e = 0, x* ^— x 
2: repeat 

3: Compute <9x in (13) 

4: Compute p, d, q, and s in (12b) 

5: Compute 5* in (14) 

6: x* ^— x* + 5dx 

7: if 5* = 5' then 

8: r <- r\ 7 - 

9: else 

io: r<-ru7 + 

11: end if 
12: until e = 1 



[> Update direction 

[> Step size 
> Update the solution 

> Remove an element from the support 

> Add a new element to the support 



on r, the cost for computing d is same as one application of an M x N matrix. Moreover, since T changes by a 
single element at every homotopy step, instead of solving the linear system in (7) from scratch, we can efficiently 
compute 9x using a rank-one update at every step: 

> Update matrix inverse: We can derive a rank-one updating scheme by using matrix inversion lemma to explicitly 
update the inverse matrix (Ap Ap) -1 , which has an equivalent cost of performing one matrix- vector product 
with anMxS and one with an S x S matrix and adding a rank-one matrix to (Ap Ar)" 1 . The update direction 
<9x can be recursively computed with a vector addition. The total cost for rank-one update is approximately 
MS + 2S 2 flops. 

> Update matrix factorization: Updating the inverse of matrix often suffers from numerical stability issues, 
especially when S becomes closer to M (i.e, the number of columns in Ap becomes closer to the number of 
rows). In general, a more stable approach is to update a Cholesky factorization of Ap Ap (or a QR factorization 
of Ar) as the support changes [29, Chapter 12], [30, Chapter 3]. The computational cost for updating Cholesky 
factors and <9x involves nearly MS + 3S 2 flops. 

As such, the computational cost of a homotopy step is close to the cost of one application of each A and A T (that 
is, close to MN + MS + 3S 2 + 0(N) flops, assuming S elements in the support). 



B. Iterative reweighting via homotopy 

In this section, we present a homotopy algorithm for iterative reweighting that quickly updates the solution of 
(3) as the weights Wj change. Suppose we have solved (3) for a given set of Wj and now we wish to solve the 
following problem: 

minimize y^Wj|xj| + - 1| Ax — y||2, (9) 
x ^ — ' 2 

i 

where the Wj are the new weights. To incorporate changes in the weights (i.e., replace the Wj with the Wj) and 
quickly compute the new solution of (3), we propose the following homotopy program: 

minimize ^((1 - e)wj + eWj)|xj| + ^||Ax - y|||, (10) 

i 

where e denotes the homotopy parameter that we change from zero to one to phase in the new weights and phase out 
the old ones. As we increase e, the solution of (10) follows a homotopy path from the solution of (3) to that of (9). 
We show below that the path the solution takes is also piecewise linear with respect to e, making every homotopy 
step computationally inexpensive. The pseudocode outlining the important steps is presented in Algorithm 1. 
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At any value of e, the solution x* must obey the following optimality conditions: 

af (Ax* - y) = -((1 - e)wi + ew^Zj, for all i G T, (11a) 
and |af(Ax* - y)| < (1 - e)wj + w„ for all i G T c , (lib) 

where a« denotes ith column of A. As we increase e to e + 5, for some small 5, the solution moves in a direction 
<9x and the optimality conditions change as 

Af (Ax* - y) + (5Af Adx = -((1 - e)W + eW)z + <5(W - W)z (12a) 
| af (Ax* - y) +<5af A<9x | < (1 - e)w; + ew ! +5 (w* - Wj), (12b) 

N v ' N v ' V v ' N V ' 

Pi di q ; S; 

where W and W denote |T| x |T| diagonal matrices with their diagonal entries being the values of w and w on 
r, respectively. The update direction is specified by the new optimality conditions (12a) as 

f(AfAr)->(W-W). o„r 
[0 on r c . 

As we increase S, the solution moves in the direction <9x until either a new element enters the support of the 
solution (when a constraint in (12b) becomes active) or an existing element shrinks to zero. The stepsize that takes 
the solution to such a critical value of e can be computed as 5* = mm(5 + , 5~), where 

(T = mm — , — (14a) 

ier- \-Si + di Si + di J + 

<r=min(^M . (14b) 
ier V dxi J + 

5 + denotes the smallest step-size that causes a constraint in (12b) to become active, indicating entry of a new 
element at index 7+ in the support, whereas 5~ denotes the smallest step-size that shrinks an existing element at 
index 7" to zero. The new critical value of e becomes e + 5*, the signal estimate x* becomes x* + <5*cbc, where 
its support and sign sequence are updated accordingly. At every homotopy step, we jump from one critical value 
of e to the next while updating the support of the solution, until e = 1. 

The main computational cost of every homotopy step comes from solving a |T| x |T| system of equations to 
compute <9x in (13) and one matrix-vector multiplication to compute the dj in (14). Since T changes by a single 
element at every homotopy step, the update direction can be computed using a rank-one update. As such, the 
computational cost of each homotopy step is close to one matrix-vector multiplication with A and one with A T . 
We demonstrate with experiments in Sec. Ill that as the Wj change, our proposed homotopy algorithm updates the 
solution in a small number of homotopy steps, and the total cost for updating weights is just a small fraction of 
the cost for solving (3) from scratch. 

C. Adaptive reweighting via homotopy 

In this section, we present a homotopy algorithm that solves a weighted £i-norm minimization problem of the 
form (3) by adaptively selecting the weights Wj. The motivation for this algorithm is to perform reweighting at 
every homotopy step by updating the weights according to the changes in the solution and its support. Recall that 
in the standard LASSO homotopy we build the solution of (2) by adding or removing one element in the support 
while shrinking a single homotopy parameter r and . By comparison, each Wj in (3) can act as a separate homotopy 
parameter, and we can attempt to achieve desired values for Wj by adaptively shrinking them at every homotopy 
step. 

In adaptive reweighting, we trace a solution path for (3) by adaptively reducing the Wj while updating the solution 
and its support in a sequence of inexpensive homotopy steps. At every homotopy step, we start with a solution of 
(3) for certain values of Wj. We encourage the algorithm to focus on the set of active indices in the solution (i.e., 
the support of the solution) and reduce the Wj so that they decrease at a faster rate and achieve smaller values on 
the active set than on the inactive set. Suppose, using certain criterion, we select the Wj as the desired values of the 
weights. As we change the Wj toward the Wj, the solution moves in a certain direction until either the Wj become 
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Algorithm 2 Adaptive reweighting via homotopy 

Require: A, y and r 
Ensure: x*, w 



Initialize: x* ^— 0, Wj «— maxj \af y| for alH, T 
repeat 

Select Wj 

Compute 3x in (17) 
Compute (5* in (19) 

x* ^— x* + <5*cbc 

Wj «- Wj + 5* (wj - Wj) for all i G r 
if 5* < 1 then 

r <- r\ 7 - 

else 

7+ = argmaxjgpc |af (Ax* - y)| 

r<-ru7 + 

end if 

Wj «— max.,- |aJ(Ax* — y)| for all i £ T c 
until max,- (w,) < r 



arg maxj |a| y| 



> Desired values for the weights 

[> Update direction 
[> Step size 
> Update the solution 
[> Update Wj on the active set 

> Remove an element from the support 

> Select the largest among inactive constraints 
> Add a new element to the support 

> Update Wj on the inactive set 



equal to the Wj or the support of the solution changes by one element. By taking into account any change in the 
support, we revise the values of Wj for the next homotopy step. We repeat this procedure until each Wj is reduced 
below a certain predefined threshold r > 0. 

In summary, we solve a single homotopy problem that builds the solution for a weighted l\ problem of the form 
(3) by adjusting the Wj according to the changes in the support of the solution. A pseudocode with a high-level 
description is presented in Algorithm 2. Details regarding weights selection, update direction, and step size and 
support selection are as follows. 

1) Weight selection criteria: Suppose we want to shrink the Wj in (3) toward a preset threshold r, and by 
construction, we want the Wj to have smaller values on the support of the solution (e.g., of the form (4)). At every 
homotopy step, we divide the indices into an active and an inactive set. We can follow a number of heuristics to 
select the desired values of the weights (wj) so that the Wj reduce at a faster rate on the active set than on the 
inactive set. 

Initialization: We initialize all the weights with a large value (e.g., Wj = maxj \af y| for all i) for which the 
solution is a zero vector. The only element in the active set correspond to argmaxj |afy|, where aj denotes zth 
column in A. 

Weights on the active set: We can select the Wj on the active set in a variety of ways. For instance, we can 
select each Wj as a fraction of the present value of the corresponding Wj as Wj «— Wj//3, for some j3 > 1, or as 
a fraction of the maximum value of the Wj on the active set as Wj <— maxj £ r Wj//3. The former will reduce each 
Wj on the active set at the same rate, while the latter will reduce each Wj to the same value as well. To introduce 
reweighting of the form in (4), we can change the Wj on the support using the solution from the previous homotopy 
step as Wj «— min(r, r//3|x*|), for some j3 > 1. 

Weights on the inactive set: We can assign the Wj on the inactive set a single value that is either equal to the 
maximum value of Wj on the active set or equal to r, whichever is the larger. 

In Fig. 1, we present three examples to illustrate the evolution of the Wj on the active and the inactive set 
at various homotopy steps. These examples were constructed with different choices of Wj during the recovery a 
Blocks signal of length 256 from 85 noisy Gaussian measurements according to the experimental setup described 
in Sec. III. We plotted the Wj at different homotopy steps in such a way that the left part of each plot corresponds 
to the active set and the right part to the inactive set of the solution. As the homotopy progresses, the support 
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(c) a hybrid of (a) and (b) 



Fig. 1: Illustrations of variations (on a log-scale) in the w,; on the sets of active and inactive indices at different homotopy 
steps. Subfigures (a), (b), and (c) correspond to three different choices for the w,;. Left part of each plot (with the lower values 
of Wi) corresponds to the active set of indices in the order in which they entered the support and the right part (with larger 
values of the w,) corresponds to the inactive set of the solution at every step. 



size increases and the Wj decrease, but the Wj on the active set become distinctly smaller than the rest. In Fig. la 
we selected Wj <— maxj e rWj/2 at every step; in Fig. lb we selected Wj <— min (r, r//3|x*|) at every step, with 
certain values of r and (3; and in Fig. lc we selected Wj maxj g rWj/2 for first few homotopy steps and then 
we selected Wj <— r//3|x*|. In our experiments in Sec. Ill, we selected weights according to the scheme illustrated 
in Fig. lb. 

2) Update direction: To compute the update direction in which the solution moves as we change the weights 
Wj toward the Wj, we use the same methodology that we used for (10) in Sec. II-B. For any given values of the 
Wj, a solution x* for (3) satisfies the following optimality conditions: 

af (Ax* - y) = -WjZj, for all i £ T, (15a) 
and |af (Ax* - y)| < w<, for all i £ T c , (15b) 

where F denotes the support of x* and z denotes the sign sequence of x* on T. If we change Wj toward Wj along 
a straight line, (1 — <5)wj + <5wj, the solution moves in a direction <9x, which to maintain optimality must obey 

Af (Ax* - y) + 6A% Adx = -Wz + «5(W - W)z, (16a) 
|af (Ax* - y) + <5af Adx| < Wi + <5(w 4 - Wj), for all i 6 r c , (16b) 

where W and W denote |T| x |T| diagonal matrices constructed with the respective values of Wj and Wj on T. 
Subtracting (15a) from (16a) yields the following expression for the update direction dx: 

5x=/ (A r Ar) ~ 1(W - W)z ° nr (17) 
[0 on r c . 

3) Step size and support selection: As we increase 5 from to 1, x* moves along the update direction <9x as 
x* + <5<9x and the Wj change toward w« as Wj + <5(wj — Wj). At certain value of 5 G (0, 1), an existing element in 
x* may shrink to zero and we must remove that element from the support. Alternatively, an inactive constraint in 
(16b) may become active, and to maintain the optimality of the solution, we must either include that element in 
the support or increase the value of Wj at that index. 

The optimality conditions (15b) and (16b) suggest that as long as a strict inequality is maintained for an index 
i in the inactive set, we can change the corresponding weight to an arbitrary value without affecting the solution. 
Sine the values of Wj are not fixed a priori in this scheme, we have the flexibility to disregard any violation of the 
inequality constraints and adjust the Wj so that the solution remains optimal. Under this setting, we can compute 
the optimal stepsize 5* and identify a change in the support of the signal as follows. 
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The smallest positive value of 6 that causes an element in x* to shrink to zero is 



«5-=min — -M , (18) 

suppose at an index 7~ G T. If S~ < 1, we must remove 7~ from the support and set 5* = 5~ . If 5~ > 1, we set 
5* = 1 and select a new element 7+ to add to the support. We set 

S* = min(<T,l), (19) 

x* = x* + (5*<9x, and w, = + <5*(wj — Wj). If S~ > 1, we select the new element 7+ that corresponds to the 
inactive constraint with largest magnitude, which can be determined as 

7 + = argmax |af(Ax* — y)|, (20) 

and set w 7 + = |a^+(Ax* — y)|. We update the Wj, x*, and V accordingly. 

We repeat the procedure of selecting Wj, computing the update direction and the stepsize, and updating the 
solution and its support at the every homotopy step, until a termination criterion is satisfied (e.g. Wj < r for all i). 

The main computational cost at every homotopy step comes from solving a |T| x |T| system of equations in (17) 
for computing <9x and one matrix- vector multiplication whenever we need to find 7+ in (20). Since Y changes by 
a single element at every homotopy step, the update direction can be efficiently computed using a rank-one update. 
As such, the computational cost of every step is equivalent one matrix-vector multiplication with A and one with 



III. Numerical experiments 

We present some experiments to demonstrate the performance of our proposed algorithms: (1) iterative reweighting 
via homotopy (Algorithm 1), which we will call IRW-H and (2) adaptive reweighting via homotopy (Algorithm 2), 
which we will call ARW-H. We evaluate the performances of ARW-H and IRW-H in terms of the computational 
cost and the reconstruction accuracy. We show that, in comparison with iterative reweighting schemes, solving (3) 
using ARW-H yields significantly higher quality signal reconstruction, at a computational cost that is comparable 
to solving (3) one time from scratch. Furthermore, we show that, using IRW-H, we can quickly update the weights 
in (3) at a small computational expense. To compare ARW-H and IRW-H against existing l\ solvers, we also 
present results for sparse signal recovery using iterative reweighting for three state-of-the-art solvers 1 : YALL1 [20], 
SpaRSA [17], and SPGL1 [16] in which we used old solutions as a "warm start" at every iteration of reweighting. 
We show that IRW-H outperforms YALL1, SpaRSA, and SPGL1 in terms of the computational cost for iterative 
reweighting, while ARW-H yields better overall performance in terms of both the computational cost and the 
reconstruction accuracy. 



A. Experimental setup 

We compared the performances of the algorithms above in recovering two types of sparse signals from noisy, 
random measurements that were simulated according to the model in (1). We generated sparse signals by applying 
wavelet transforms on the modified forms of "Blocks" and "HeaviSine" signals from the Wavelab toolbox [34] as 
described below. 

i. Blocks: We generated a piecewise-constant signal of length N by randomly dividing the interval [1, N] into 
1 1 disjoint regions. Setting the first region to zero, we iteratively assigned a random value to every region by 
adding an integer chosen uniformly in the range of [—5, 5] to the value from the previous region. We applied 
Haar wavelet transform on the piecewise-constant signal to generate a sparse signal x. An example of such 
a piecewise-constant signal and its Haar wavelet transform is presented in Fig. 2a. Because of the piecewise 
constant structure of these signals, the resulting Haar wavelet transforms will have only a small number of 
nonzero coefficients that depend on the number of discontinuities and the finest wavelet scale. Since we have 

'We selected these solvers for comparison because, among the commonly used i\ solvers [31-33], we found these to be the fastest and 
sufficiently accurate with a warm start. 
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(a) Blocks (b) HeaviSine 



Fig. 2: (a) An example of piecewise-constant (blocks) signal and its sparse representation using Haar wavelets, (b) An example 
of perturbed HeaviSine signal and its sparse representation using Daubechies-4 wavelets. 



fixed the number of discontinuities, the ratio of the number of nonzero elements to the length of the signal 
becomes smaller as the length of the signal (N) increases, 
ii. HeaviSine: We generated a sinusoidal signal with nearly two cycles and two jumps at random locations. First, 
we generated a sinusoidal signal of length N for which we selected the amplitude in the range of [4, 6] 
and number of cycles in the range [2, 2.5] uniformly at random. Then, we divided the signal into three non- 
overlapping regions and added a different Gaussian random variable to each region. We applied Daubechies 4 
wavelet transform on the resulting signal to generate the sparse signal x. An example of such a sinusoidal 
signal with jumps and its Daubechies 4 wavelet transform is presented in Fig. 2b. In this type of signals, most 
of the wavelet coefficients in x will not be exactly zero, but if sorted in the decreasing order of magnitude, the 
coefficients quickly decay to extremely small values. Hence, this type of signals can be classified as near-sparse 
or compressible. 

In every experiment, we generated an M xN measurement matrix A with its entries drawn independently according 
to Af(0, l/\/M) distribution and added Gaussian noise vector e to generate the measurement vector as y = Ax + e. 
We selected each entry in e as i.i.d. M(0,a 2 ), where the variance cr 2 was selected to set the expected SNR with 
respect to the measurements Ax at 40 dB. We reconstructed solution x using all the algorithms according to the 
procedures described below. 

In our experiments, we fixed the parameter r = ay/log N, where a denotes the standard deviation of the 
measurement noise. Although the weights can be tuned according to the signal structure, measurement matrix, 
and noise level, we did not make such an attempt in our comparison. Instead, we adopted a general rule for 
selecting weights that provided good overall performance for all the solvers, in all of our experiments. We set up 
the algorithms in the following manner. 

i. ARW-H 2 : We solved a weighted £i-norm minimization problem of the form (3) following the procedure 

outlined in Algorithm 2, in which the exact values for the Wj are not known a priori as they are selected 

( T \ ll x *ll 2 
adaptively. In line 3 of Algorithm 2, we selected Wj min I r, — - — r ) using (3 <— M- — ^| at every step, 

V P\K\J H x *lli 

where x* denotes the solution from the previous homotopy step. We selected this value of (3 because it helps 
in shrinking the Wj to smaller values when M is large and to larger values when the solution is dense. (We 
are using the ratio of l\ to £2 norm as a proxy for the support size here.) The main computational cost at 
every step of ARW-H involves one matrix-vector multiplication for identifying a change in the support and 
a rank-one update for computing the update direction. We used the matrix inversion lemma-based scheme to 
perform the rank-one updates. 

ii. IRW-H 2 : We solved (3) via iterative reweighting in which we updated the solution at every reweighting iteration 
according to the procedure outlined in Algorithm 1. For the first iteration, we used standard LASSO homotopy 



2 MATLAB code added to the ^i-homotopy package available at http://users.ece.gatech.edu/~sasif/homotopy. Additional experiments 
and scripts to reproduce all the experiments in this paper can also be found on this webpage. 
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algorithm [35] to solve (2), which is equivalent to (3) when Wj = r for all i. Afterwards, at every reweighting 
iteration, we updated the Wj as 

where x denotes the solution from previous reweighting iteration and j3 > 1 and e > denote two parameters 
that can be used to tune the weights according to the problem. In our experiments, we fixed e = 1 and 

llxll^ 

updated f3 <— M-—— § at every reweighting iteration. The main computational cost at every step of IRW-H also 

Hi 

involves one matrix-vector multiplication and a rank-one update of a small matrix. We used matrix inversion 
lemma-based scheme to perform rank-one updates. 

iii. YALL1 3 : YALL1 is a first-order algorithm that uses an alternating direction minimization method for solving 
various l\ problems, see [20] for further details. We iteratively solved (3) using weighted-^i jli solver in YALL1 
package. For the initial solution, we solved (2) using YALL1. At every subsequent reweighting iteration, we 
used previous YALL1 solution to renew the weights according to (21) and solved (3) by providing the old 
solution as a warm-start to YALL1 solver. We fixed the tolerance parameter to 1CT 4 in all the experiments. 
The main computational cost of every step in the YALL1 solver comes from applications of A and A T . 

iv. SpaRSA 4 : SpaRSA is also a first-order method that uses a fast variant of iterative shrinkage and thresholding 
for solving various l\ -regularized problems, see [17] for further details. Similar to IRW-H and YALL1, we 
iteratively solved (3) using SpaRSA, while updating weights using the old solution in (21) and using the 
old solution as a warm-start at every reweighting iteration. We used the SpaRSA code with default adaptive 
continuation procedure in the Safeguard mode using the duality gap-based termination criterion for which we 
fixed the tolerance parameter to 10 -4 and modified the code to accommodate weights in the evaluation. The 
main computational cost for every step in the SpaRSA solver also involves applications of A and A T . 

v. SPGL1 5 : SPGL1 solves an equivalent constrained form of (3) by employing a root- finding algorithm [16]. We 
solved the following problem using SPGL1: 



minimize 

X 



N 

y^Wj|xj| subject to || Ax — y || 2 < A, (22) 

i=i 

in which we used A = o\jM. For the initial solution, we solved (22) using Wj = 1 for all i. At every 
subsequent reweighting iteration, we used previous SPGL1 solution to renew the weights according to (21) 
(using r = 1) and solved (22) using the old solution as a warm start. We solved SPGL1 using default parameters 
with optimality tolerance set at 10~ 4 . The computational cost of every step in SPGL1 is also dominated by 
matrix-vector multiplications. 
To summarize, ARW-H solves (3) by adaptively selecting the values of Wj, while IRW-H, YALL1, and SpaRSA 
iteratively solve (3) and SPGL1 iteratively solves (22), using updated values of Wj at every reweighting iteration. 

We used MATLAB implementations of all the algorithms and performed all the experiments on a standard 
desktop computer using MATLAB 2012a. We used a single computational thread for all the experiments, which 
involved recovery of a sparse signal from a given set of measurements using all the candidate algorithms. In every 
experiment, we recorded three quantities for each algorithm: 1) the quality of reconstructed signal in terms of 
signal-to-error ratio in dB, defined as 

SER = 201og 10 I|X| ^ , 
ll x _ x lb 

where x and x denote the original and the reconstructed signal, respectively, 2) the number of matrix-vector products 
with A and A T , and 3) the execution time in MATLAB. 

B. Results 

We compared performances of ARW-H, IRW-H, YALL1, SpaRSA, and SPGL1 for the recovery of randomly 
perturbed Blocks and HeaviSine signals from random, noisy measurements. We performed 100 independent trials for 

3 MATLAB package for YALL1 is available at http://yalll.blogs.rice.edu/. 
4 MATLAB package for SpaRSA is available at http://lx.it.pt/~mtf/SpaRSA/. 
5 MATLAB package for SPGL1 is available at http://www.cs.ubc.ca/labs/scl/spgll 
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(a) SER — Blocks 



(b) SER — HeaviSine 



Fig. 3: Comparison of SER for the recovery of sparse signals: (Left) Haar wavelet transform of randomly perturbed Blocks 
signals and (Right) Daubechies 4 wavelet transform of randomly perturbed HeaviSine signals, measured with M x TV Gaussian 
matrices in the presence of Gaussian noise at 40 dB SNR. ARW-H solves adaptive-reweighted l\ problem once, while other 
methods solve unweighted l\ problem in their first iteration and perform five reweighted iterations afterwards. (First row) 
SER for the solution after first iteration. (Second row) SER for solutions after five reweighting iterations (SER for ARW-H is 
copied from the top row) 



each of the following combinations of N and M: iV = [256, 512, 1024] and M = [N/2, N/2.5, N/3, N/3.5, N/4]. 
In each experiment, we recovered a solution x from simulated noisy, random measurements using all the algorithms, 
according to the procedures described above, and recorded corresponding SER, number of matrix-vector products, 
and MATLAB runtime. The results, averaged over all the trials, for each combination of M and iV are presented 
in Figures 3a-5a (for Blocks signals) and Figures 3b-5b (for HeaviSine signals). 

Comparison of SERs for the solutions of all the algorithms at different values of and M is presented in 
Fig. 3a (for Blocks signals) and Fig. 3b (for HeaviSine signals). Three plots in the first row depict SERs for the 
solutions after first iteration of all the algorithms. Since ARW-H solves a weighted ^i-norm formulation (as in (3)) 
via adaptive reweighting, its performance is superior to all the other algorithms, which solve unweighted £i-norm 
problems in their first iteration. Since SPGL1 solves the l\ problem in (22), its performance is slightly different 
compared to IRW-H, YALL1, and SpaRSA, all of which solve (3) and should provide identical solutions if they 
converge properly. The plots in the second row present SERs for the solutions after five reweighting iterations of all 
the algorithms except ARW-H, which was solved only once. As we can see that the solutions of ARW-H display 
the best SERs in all these experiments. Although SERs for the solutions of IRW-H, YALL1, SpaRSA, and SPGL1 
improve with iterative reweighting, in some cases there is a significant gap between their SERs and that of ARW-H. 

Comparison of the computational cost of all the algorithms in terms of the number of matrix-vector multiplications 
is presented in Fig. 4a (for Blocks signals) and Fig. 4b (for HeaviSine signals). We counted an application of each 
A and A T as one application of A T A 6 . Three plots in the first row present the count of A T A applications that 
each algorithm used for computing the initial solution. Second row depicts the count of A T A applications summed 
over five reweighting iterations in each of the algorithm. Since we solved ARW-H just once, the count for ARW-H 
is zero and does not appear in the second row. Third row presents total count of A T A applications, which is the 
sum of the counts in the first and the second row. We can see in the second row that, compared to YALL1, SPGL1, 
and SpaRSA, IRW-H used a distinctly smaller number of matrix-vector products for updating the solution as the 
weights change in iterative reweighting. The final count in the third row shows that ARW-H consumed the least 

6 For the homotopy algorithms, we approximated the cost of one step as one application of A T A. 
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Fig. 4: Comparison of the number of matrix-vector products for the recovery. ARW-H solves adaptive-reweighted l\ problem 
once, while other methods solve unweighted ii problem in their first iteration and perform five reweighted iterations afterwards. 
(First row) Count for the first iteration only. (Second row) Count for all the reweighting iterations (ARW-H does not appear 
because its count is zero). (Third row) Count for all the iterations. 



number of total A T A applications in all the cases. 

Comparison of MATLAB runtime for all the algorithms is presented in Fig. 5a (for Blocks signals) and Fig. 5b (for 
HeaviSine signals). The first row presents runtime that each algorithm utilized for computing the initial solution, the 
second row presents execution time for five reweighting iterations, and the third row presents total time consumed 
by each of the recovery algorithms. As we can see in the second row that, compared to YALL1, SPGL1, and 
SpaRSA, IRW-H consumed distinctly lesser time for updating solutions in iterative reweighting. In the third row, 
we see small difference in the total runtime for IRW-H, SpaRSA, and YALL1, where IRW-H and SpaRSA display 
comparable performance. Nevertheless, in all the experiments, the total runtime for ARW-H is the smallest among 
all the algorithms. 

A brief summary of the results for our experiments is as follows. We observed that the adaptive reweighting 
scheme (ARW-H) recovered signals with better SERs compared to the iterative reweighting schemes (IRW-H, 
SpaRSA, YALL1, and SPGL1), and it does so by solving a single homotopy problem at the expense of a small 
amount of computational cost and time. Among the iterative reweighting schemes, IRW-H quickly updated the 
solutions during iterative reweighting at the expense of marginal computational cost and time, which are distinctly 
smaller than the respective costs and times for SpaRSA, YALL1, and SPGL1; although SpaRSA and YALL1 with 
warm-start provided competitive results for longer signals. 

IV. Discussion 

We presented two homotopy algorithms that can efficiently solve reweighted l\ problems. In Sec. II-B, we 
presented an algorithm for updating the solution of (3) as the w, change. We demonstrated with experiments that, 
in reweighting iterations, our proposed algorithm quickly updates the solution at a small computational expense. In 
Sec. II-C, we presented a homotopy algorithm that adaptively selects weights inside a single homotopy program. We 
demonstrated with experiments that our proposed adaptive reweighting method outperforms iterative reweighting 
methods in terms of the reconstruction quality and the computational cost. 
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Fig. 5: Comparison of MATLAB runtime for the recovery. (First row) Time for the first iteration. ARW-H solves adaptive- 
reweighted l\ problem once, while other methods solve unweighted l\ problem in their first iteration and perform five reweighted 
iterations afterwards. (Second row) Time for all the reweighted iterations. (Third row) Total runtime. 



We would like to mention that an adaptive reweighting scheme, similar to the one we presented for homotopy, 
can also be embedded inside iterative shrinkage-thresholding algorithms [17, 18, 33, 36, 37]. A recent paper [38] has 
also employed a similar principle of reweighting inside SPGL1. Standard iterative shrinkage-thresholding algorithms 
solve the program in (3) by solving the following shrinkage problem at every inner iteration: 

minimize —||x — u||2 + Wj|xj|, (23) 
x 2 ^ — ' 

where u = x fc_1 — -^A T (Ax fc_1 — y) denotes a vector that is generated using a solution x^ 1 from a previous 
iteration and L determines the stepsize. The solution of (23) involves a simple soft-thresholding of the entries in 
u with respect to the Wj/L (i.e., Xj = soft(uj, Wj/L), where soft(u, a) = sign(u) max{|-u| — a,0} defines the 
soft-thresholding/shrinkage function). To embed adaptive reweighting inside such shrinkage algorithms, instead of 
using a fixed set of Wj for soft-thresholding in (23) at every iteration, we can adaptively select the Wj according 
to the changes in the solution. 

To examine our suggestion, we added an adaptive reweighting scheme in the source code of SpaRSA. SpaRSA 
offers a feature for adaptive continuation in which it starts the shrinkage parameter r with a large value and 
decreases it towards the desired value after every few iterations. We added an extra line in the code that uses 
the available solution x to update each w, as min (r, t//3|x,|) whenever r changes. We gauged the performance 
of this modified method by using it for the recovery of gray-scale images from compressive measurements. The 
problem setup following the model in (1) is as follows. We generated a sparse signal x of length N by applying 
a Daubechies 9/7 biorthogonal wavelet transform [39, 40] with odd-symmetric extension on an image, selected A 
as a subsampled noiselet transform [41], and added Gaussian noise e in the measurements by selecting each entry 
in e as i.i.d. W(0, a 2 ). 

In Fig. 6 we present results averaged over 10 experiments that we performed for the recovery of three 256 x 256 
images from M = 30,000 noiselet measurements in the presence of noise at 40 dB SNR using SpaRSA in three 
different ways. The first column presents the original 256 x 256 images. The second column presents small portions 
of the images that we reconstructed by solving the standard i\ problem in (2) using r = o^/\og N. The peak 
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Fig. 6: Results for the recovery of 256 x 256 images from M = 30, 000 noiselet measurements in the presence of Gaussian noise 
at 40dB SNR, using Daubechies 9/7 biorthogonal wavelet transform with odd-symmetric extensions as the sparse representation. 
(Column 1) Original images. (Column 2) Portions of the images (inside the orange box) reconstructed by solving (2) using 
SpaRSA. (Column 3) Reconstruction after three reweighting iterations. (Column 4) Adaptive reweighting by updating the 
w, after every continuation step in SpaRSA. The caption under each subimage shows the PSNR over the entire reconstructed 
image and a count for the number of applications of A T A (in parentheses) averaged over 10 experiments. 



signal-to-noise ratio (PSNR) for the entire reconstructed image is presented in each caption along with the total 
count for the number of applications of A T A (in parentheses) averaged over 10 experiments. The third column 
presents portions of the reconstructed images after three reweighting iterations using the warm-start and weight 
selection procedure employed in the experiments in Sec. III. The last column presents portions of the reconstructed 
images by solving the modified version of SpaRSA in which we initialized the SpaRSA code by setting all the 
weights to a same value, and after every continuation iteration we modified the values of weights according to the 
available solution. We observe that SpaRSA with this adaptive reweighting modification yields better performance 
in terms of PSNR over the other two methods, while its computational cost is significantly smaller than the cost 
of iterative reweighting (in column 2). 

We have presented these results as a proof-of-concept and we anticipate that adding a simple adaptive reweighting 
scheme within existing iterative shrinkage algorithms can potentially enhance their performance in many scenarios 
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without any additional cost; however, a detailed study in this regard is beyond the scope of this paper. 
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